Skip to content

[SPARK-5753] [SQL] add JDBCRDD support for postgres types: uuid, hstore and arrays#4549

Closed
lepfhty wants to merge 1 commit into
apache:masterfrom
lepfhty:master
Closed

[SPARK-5753] [SQL] add JDBCRDD support for postgres types: uuid, hstore and arrays#4549
lepfhty wants to merge 1 commit into
apache:masterfrom
lepfhty:master

Conversation

@lepfhty

@lepfhty lepfhty commented Feb 12, 2015

Copy link
Copy Markdown

PostgresQuirks:

  • added uuid as StringType
  • added hstore as MapType[StringType, StringType]
  • added ArrayType

JDBCRDD needed conversion functions (nonspecific to postgres) for ArrayType and MapType.

  • ArrayConversion uses the ResultSet#getArray(int).getArray() method, so it should generally work.
  • MapConversion assumes that ResultSet#getObject(int) returns a java.util.Map<String,String>. Maybe this is too strong an assumption?

Not sure how to run/write tests. Would appreciate pointers here.

Thanks!

@lepfhty lepfhty changed the title [SPARK-5753] add JDBCRDD support for postgres types: uuid, hstore and arrays [SPARK-5753] [SQL] add JDBCRDD support for postgres types: uuid, hstore and arrays Feb 12, 2015
@marmbrus

Copy link
Copy Markdown
Contributor

Thanks for doing this and sorry for the delay reviewing! Des H2 support any of these types? Would it be possible to add some tests?

@marmbrus

Copy link
Copy Markdown
Contributor

test this please

@marmbrus

Copy link
Copy Markdown
Contributor

@liancheng what is the status of the postgres docker tests. It would be great if we could verify these changes.

@SparkQA

SparkQA commented Mar 12, 2015

Copy link
Copy Markdown

Test build #28530 has finished for PR 4549 at commit 3f67996.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lepfhty

lepfhty commented Mar 12, 2015

Copy link
Copy Markdown
Author

not sure how or where to write tests. esp for the postgres-specific stuff. I'd appreciate some help and direction for this if you'd like me to write the tests.

h2 does have an array type that could be tested for the ArrayConversion. I'm guessing this goes into JDBCSuite.scala.

@marmbrus

Copy link
Copy Markdown
Contributor

We used to have docker tests for running postgres specific things but had to remove them for dependency reasons. @liancheng will have more information about the status there.

For the array type, yeah, it would be great to add something in JDBCSuite.scala. Thanks!

@liancheng

Copy link
Copy Markdown
Contributor

@marmbrus @lepfhty The integration tests were removed in #4872 because they cause dependency issue and are ignored from the very beginning anyway. I plan to but haven't migrated the JDBC integration tests to databricks/spark-integration-tests yet. Tracking the migration with databricks/spark-integration-tests#6.

@marmbrus

marmbrus commented Apr 3, 2015

Copy link
Copy Markdown
Contributor

@liancheng, can you provide pointers here on how tests for postgres could be added? I'd like to merge this feature, but I'm worried about doing it without tests.

I'd be fine resurrecting the tests in apache (now that the release is over maybe we have more time to fix the dependency issues) as this would let us test patches like this more easily.

@JoshRosen any idea if jenkins would support running docker based tests in the near future?

@JoshRosen

Copy link
Copy Markdown
Contributor

@marmbrus, I think that we should be able to get Docker set up on Jenkins soon, since I think it should be as simple as an apt-get install docker.io (@shaneknapp knows more about this).

@nickpoorman

Copy link
Copy Markdown

+1 for UUID type support.

@andrewor14

Copy link
Copy Markdown
Contributor

@lepfhty @marmbrus what is the status of this patch? It seems to be stale at this point and maybe we should just close it. I would recommend that we reopen it later if there is interest.

@marmbrus

Copy link
Copy Markdown
Contributor

Yeah, @lepfhty, very sorry to have let this go stale. If you have time to reopen this, I believe that we can try and get the integration test we have already written working again (now that our jenkin's cluster has docker support) and augment them to test your new code.

Basically start by reverting this patch: 76b472f

Then we can add tests for these specific python quirks.

@AmplabJenkins

Copy link
Copy Markdown

Can one of the admins verify this patch?

@jakajancar

Copy link
Copy Markdown

Really unfortunate that this patch (which seems unlikely to cause regressions) was held "hostage" by the lack of testing capabilities. By this logic pretty much all JDBC-related changes should be disallowed.

:(

@srowen

srowen commented Aug 2, 2015

Copy link
Copy Markdown
Member

@lepfhty do you mind closing this PR?

@lepfhty

lepfhty commented Aug 2, 2015

Copy link
Copy Markdown
Author

I'll try to rework this into JdbcDialects

@lepfhty lepfhty closed this Aug 2, 2015
@jakajancar

Copy link
Copy Markdown

@lepfhty Has any progress been made on this? Can you point me to a PR/branch/JIRA issue/... that I can subscribe to?

@lepfhty

lepfhty commented Aug 28, 2015

Copy link
Copy Markdown
Author

sorry i haven't gotten around to this. they changed their base classes. it might be easier to do as a spark-package... i could give it another try in a week or two

@mariusvniekerk

Copy link
Copy Markdown
Member

I've given this a shot in #9137

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.